Skip to content

BTreeMap::merge optimized#152418

Open
asder8215 wants to merge 14 commits intorust-lang:mainfrom
asder8215:btreemap_merge_optimized
Open

BTreeMap::merge optimized#152418
asder8215 wants to merge 14 commits intorust-lang:mainfrom
asder8215:btreemap_merge_optimized

Conversation

@asder8215
Copy link
Contributor

This is an optimized version of #151981. See ACP for more information on BTreeMap::merge does.

CC @programmerjake. Let me know what you think of how I'm using CursorMut and IntoIter here and whether the unsafe code here looks good. I decided to use ptr::read() and ptr::write() to grab the value from CursorMut as V than &mut V, use it within the conflict function, and overwrite the content of conflicting key afterward.

I know this needs some polishing, especially with refactoring some redundant looking code in a nicer way, some of which could probably just be public API methods for CursorMut. It does pass all the tests that I currently have for BTreeMap::merge (inspired from BTreeMap::append) though, so that's good.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 10, 2026
@asder8215 asder8215 force-pushed the btreemap_merge_optimized branch from a5a105e to a7fa7dd Compare February 10, 2026 02:59
Comment on lines 1315 to 1322
// SAFETY: We read in self_val's and hand it over to our conflict function
// which will always return a value that we can use to overwrite what's
// in self_val
unsafe {
let val = ptr::read(self_val);
let next_val = (conflict)(self_key, val, first_other_val);
ptr::write(self_val, next_val);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should really be a method of Cursor*, you need a handler for when conflict panics that removes the entry without dropping whatever you ptr::read from.
something like:

impl<K, V> CursorMut<'a, K, V> {
    /// call `f` with the next entry's key and value, replacing the next entry's value with the returned value. if `f` unwinds, the next entry is removed.
    /// equivalent to a more efficient version of:
    /// ```rust
    /// if let Some((k, v)) = self.remove_next() {
    ///     let v = f(&k, v);
    ///     // Safety: key is unmodified
    ///     unsafe { self.insert_after_unchecked(k, v) };
    /// }
    /// ```
    pub(super) fn with_next(&mut self, f: impl FnOnce(&K, V) -> V) {
        struct RemoveNextOnDrop<'a, 'b, K, V> {
            cursor: &'a mut CursorMut<'b, K, V>,
            forget_next: bool,
        }
        impl<K, V> Drop for RemoveNextOnDrop<'_, '_, K, V> {
            fn drop(&mut self) {
                if self.forget_next {
                    // call an equivalent to CursorMut::remove_next()
                    // except that instead of returning `V`, it never moves or drops it.
                    self.0.forget_next_value();
                }
            }
        }
        let mut remove_next_on_drop = RemoveNextOnDrop {
            cursor: self,
            forget_next: false, // we don't know that we have a next value yet
        };
        if let Some((k, v_mut)) = remove_next_on_drop.cursor.peek_next() {
            remove_next_on_drop.forget_next = true;
            // Safety: we move the V out of the next entry,
            // we marked the entry's value to be forgotten
            // when remove_next_on_drop is dropped that
            // way we avoid returning to the caller leaving
            // a moved-out invalid value if `f` unwinds.
            let v = unsafe { std::ptr::read(v_mut) };
            let v = f(k, v);
            // Safety: move the V back into the next entry
            unsafe { std::ptr::write(v_mut, v) };
            remove_next_on_drop.forget_next = false;
        }
    }
}

The equivalent CursorMutKey method should instead have f be impl FnOnce(K, V) -> (K, V) and needs to forget both the key and value since they were both ptr::read, and not just the value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got you 👍

@asder8215
Copy link
Contributor Author

Appreciate your patience and feedback in all of this @programmerjake!

@asder8215
Copy link
Contributor Author

@programmerjake Took your suggestion on with_next() on Cursor* and made and forget_next*() functions. Lmk what you think about this!

Also, I've been wondering if with_next() has uses to be a public accessible function?

let mut emptied_internal_root = false;
if let Ok(next_kv) = current.next_kv() {
let ((_, val), pos) =
next_kv.remove_kv_tracking(|| emptied_internal_root = true, self.alloc.clone());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove_kv_tracking may panic and drop the key and value, tbh this is very complicated and makes me think it could be better for with_next() instead of calling forget_next[_key_and]_value, to transmute the cursor reference type from &mut CursorMut<K, V> to &mut CursorMut<K, ManuallyDrop<V>> and then just call remove_next. that said, idk if that would be sound, asking on Zulip: #t-libs > could we use transmute tricks in the impl of BTreeMap? @ 💬

Copy link
Contributor Author

@asder8215 asder8215 Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I think transmute should be fine since &mut CursorMut<K, V> and &mut CursorMut<K, ManuallyDrop<V>> are the same size and it forgets the original, so we wouldn't be running the destructor on the original. (but of course I can't hold myself to that since I haven't played around with mem::transmute much myself)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might differ with -Zrandomize-layout or other future struct layout changes. I haven't reviewed the whole BTreeMap implementation to see what layout guarantees it has.

@rustbot
Copy link
Collaborator

rustbot commented Feb 11, 2026

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@asder8215 asder8215 force-pushed the btreemap_merge_optimized branch from 6d826f5 to 97f5547 Compare February 11, 2026 04:25
@asder8215
Copy link
Contributor Author

@programmerjake Made the change, lmk what you think!

@asder8215 asder8215 force-pushed the btreemap_merge_optimized branch from b26e4e1 to 6c436c1 Compare February 12, 2026 21:48
…nstead of drop handling, refactored code a bit, and updated comments
@asder8215 asder8215 force-pushed the btreemap_merge_optimized branch from 6c436c1 to 9b423c5 Compare February 12, 2026 21:56
Copy link
Member

@programmerjake programmerjake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good enough, though I didn't really review the tests.

View changes since this review

@asder8215
Copy link
Contributor Author

asder8215 commented Feb 13, 2026

The tests I took verbatim from append, which seemed thorough enough in examining whether the edges are fixed after insertion, drop behavior, and checking if the key is not overwritten on conflict.

One thing I'll point out is that interestingly this append test:

#[test]
#[cfg_attr(not(panic = "unwind"), ignore = "test requires unwinding support")]
fn test_append_drop_leak() {
    let a = CrashTestDummy::new(0);
    let b = CrashTestDummy::new(1);
    let c = CrashTestDummy::new(2);
    let mut left = BTreeMap::new();
    let mut right = BTreeMap::new();
    left.insert(a.spawn(Panic::Never), ());
    left.insert(b.spawn(Panic::Never), ());
    left.insert(c.spawn(Panic::Never), ());
    right.insert(b.spawn(Panic::InDrop), ()); // first duplicate key, dropped during append
    right.insert(c.spawn(Panic::Never), ());

    catch_unwind(move || left.append(&mut right)).unwrap_err();
    assert_eq!(a.dropped(), 1);
    assert_eq!(b.dropped(), 1); // should be 2 were it not for Rust issue #47949
    assert_eq!(c.dropped(), 2);
}

When I ported a similar version of this for merge:

#[test]
#[cfg_attr(not(panic = "unwind"), ignore = "test requires unwinding support")]
fn test_merge_drop_leak() {
    let a = CrashTestDummy::new(0);
    let b = CrashTestDummy::new(1);
    let c = CrashTestDummy::new(2);
    let mut left = BTreeMap::new();
    let mut right = BTreeMap::new();
    left.insert(a.spawn(Panic::Never), ());
    left.insert(b.spawn(Panic::Never), ());
    left.insert(c.spawn(Panic::Never), ());
    right.insert(b.spawn(Panic::InDrop), ()); // first duplicate key, dropped during append
    right.insert(c.spawn(Panic::Never), ());

    catch_unwind(move || left.merge(right, |_, _, _| ())).unwrap_err();
    assert_eq!(a.dropped(), 1);
    assert_eq!(b.dropped(), 2);
    assert_eq!(c.dropped(), 2);
}

The assert_eq!(b.dropped(), #) gave me 2 instead of 1 (which is expected behavior it seems).

@asder8215
Copy link
Contributor Author

I added a panic test within conflict, but I think due to #47949, it isn't incrementing the counter for b_val* and c_val* on drop. The left BTreeMap does have a length of 1 though.

@asder8215 asder8215 marked this pull request as ready for review February 15, 2026 22:22
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 15, 2026
@rustbot rustbot removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Feb 15, 2026
@rustbot
Copy link
Collaborator

rustbot commented Feb 15, 2026

r? @jhpratt

rustbot has assigned @jhpratt.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: libs
  • libs expanded to 7 candidates
  • Random selection from Mark-Simulacrum, jhpratt, joboet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants